Abstract

Recently developed concepts and techniques of analyzing complex systems provide new insight 
into the structure of social networks. Uncovering recurrent preferences and organizational principles 
in such networks is a key issue to characterize them. We investigate school friendship networks from 
the Add Health database. Applying threshold analysis, we find that the friendship networks do 
not form a single connected component through mutual strong nominations within a school, while 
under weaker conditions such interconnectedness is present. We extract the networks of overlapping 
communities at the schools (c-networks) and find that they are scale free and disassortative in 
contrast to the direct friendship networks, which have an exponential degree distribution and are 
assortative. Based on the network analysis we study the ethnic preferences in friendship selection. 
The clique percolation method we use reveals that when in minority, the students tend to build 
more densely interconnected groups of friends. We also find an asymmetry in the behavior of black 
minorities in a white majority as compared to that of white minorities in a black majority. 

Social structures in schools are subject to intense investigation for many obvious reasons.
Schools, attended by the major part of the population, form social systems that are well defined
units, allowing relationships, networking and social processes to be studied in a condensed way.
The relationships of adolescents show remarkable peculiarities: they are influenced by family
backgrounds and, at the same time, they are precursors of the future society. The problems of
spreading sexually transmitted diseases, of drug abuse or of delinquency among adolescents
and young adults are closely related to their social embedding in the schools and so are their
racial/ethnic preferences.

The investigation of patterns of friend selection is a major source of our knowledge on
social structures in schools. Mapping out the friendship networks based on questionnaires
has been a successful approach in this respect, where the existence and intensity of dyadic
connections are identified using nominations of the students [?, ?, 5, 7]. It is known that sex
and race/ethnicity are two primary characteristics on which students base their selections
of friends [1] and here we would like to focus on the latter.

Desegregation of schools as a function of the racial diversity has been a topic of analysis
in multi-ethnic countries in Western Europe [3, 10] and the USA [4], [5], [6]. These studies
suggested that the way schools are organized could affect the level of racial friendship
segregation. In recent studies of friendship networks, p* and related models [8], [9] were
successfully used to identify how some of the attributes of the network members are correlated
with their inclinations in choosing group relationships. However, as the measures of segregation
are still under discussion [?], and even racial classification schemes seem problematic [12], we
think it useful to approach this problem from a different angle, namely to apply concepts
and results from the science of complex networks.
These results include the quantitative characterization of hierarchical ordering,
new, efficient methods of community detection [40], even of overlapping ones,
and pointing out relations between functionality and weights of the links in the network.
[...] complex networks, including social ones, within this new framework [...].

Our aim here is to present an analysis of friendship networks in schools based on the
representative US National Longitudinal Study of Adolescent Health (Add Health [24]).
First we carry out a topological study and apply threshold analysis [25] in order to identify
the network which is most appropriate for our further investigation. In contrast to earlier
work, our study focuses on the communities instead of the dyadic links. Interestingly, we
uncover that the properties of the direct friendship network are significantly different from
those of the network at the next hierarchical level, namely the network of communities.



Friendship networks 

The friendship networks presented here are constructed from the in-school questionnaires
of the Add Health [24] friendship nomination study from the period 1994-1995, in which
90118 students participated. The analyzed data are limited to students who provided
information from schools with response rates of 50% or higher. Every student was given a
paper-and-pencil questionnaire and a copy of a list with every schoolmate. Weighted dyadic
links were generated based on the number of shared activities. Weights were in the range
from 1, meaning the student nominated the friend without reporting any activity, to 6,
meaning that the student nominated the friend and reported participating in all five activities
with him/her.

The structure files contain information on 75871 nodes divided into 84 networks (schools).
In most of the analyzed schools the majority of the population is white; however,
there are significant fluctuations. In particular, the ratios of the races in the total
population are the following: White: 0.59, Black: 0.14, Hispanic: 0.13, Asian: 0.04 and Other: 0.1.
In Figs. 1(a)-(b) we visualize the friendship networks for two schools with Pajek [34]. Fig. 1(a) is
a characteristic sample of the 84 schools; we call it here School 1. In this school the great
majority of the population is white (70%), in contrast to a non-characteristic sample,
School 2, visualized in Fig. 1(b), where blacks (40%) are overrepresented with respect to
the average. Nodes represent students, with colors indicating their race. A link is drawn
between two nodes if at least one of the students nominates the other as a friend. The spatial
distribution of the nodes corresponds to the different grades, placed counterclockwise,
starting with the 7th grade at the lower right corner and ending with the 12th grade. Visual
inspection of the intergrade links already shows that there is a separation between the upper
grades (high school) and the lower grades (middle school). While the partition according
to the grades was introduced "by hand", the separation of colors within the 6 groups is not
artificial; the apparent clustering of nodes of the same color is due to the fact that
they are more densely interconnected.

FIG. 1: (a)-(b) Networks of friendships from Schools 1 and 2, respectively. Nodes represent
students, with colors indicating their race. The spatial distribution of nodes corresponds to the
different grades, placed counterclockwise, from 7th to 12th grade. (c) Left: Fraction $G/N$ of sites
in the largest connected component $G$ for the networks with mutual links only (circles) and networks
with mutual and non-mutual links (squares) versus threshold weight $w_c$. Only links with average
weight in both directions $w > w_c$ are kept. Right: Second moment of the normalized number of
clusters, excluding the largest component, for the same analysis as in the left part.

Role of weights and directionality 



Checking mutuality in a whole-network study [35] gives some insight into the reliability
of the answers given to the questionnaires. In an ideal case both participants of a dyadic
relationship should name each other with the same weight. We apply threshold analysis to
measure the influence of weights and directionality in the links. In order to analyze the role
of the weights we take an average over all schools.

First, we analyze the network formed only by mutual links, i.e., mutual nominations,
which should carry the more reliable information about stronger relations or tight friendships
inside the networks. We introduce the mean of the weights in both directions to characterize
the weight $w$ of each link. We examine different thresholds $w_c$ for creating links, i.e., a
link is created only if there is a mutual link and $w > w_c$. The values of the weights go from
1 to 6; the weakest possible restriction is $w = 1$, which includes any mutual link present
in the network. In the left part of Fig. 1(c) (circles) we present the calculations of $G/N$,
the fraction of nodes that belong to the largest cluster, vs. $w_c$. On the right side,
$\sum_s s^2 n_s$, the second moment of the normalized number of clusters $n_s$ of size $s$
(excluding the largest cluster), is presented. Interestingly, when considering only mutual
connections, $G$ is roughly half of the population, and the network is split into various components.

Next we carry out the threshold analysis for a network defined as follows: a link is
formed if at least one nomination exists and $w > w_c$; the weight $w$ of a link is again taken as
the mean of the weights in both directions, with the extension that zero is taken for the
direction in which the nomination does not exist. For this case, we find a transition as a function
of $w_c$: the population is disconnected into many clusters for $w_c > 2$ while a giant component
occurs for $w_c < 2$. This effect is shown on both sides of Fig. 1(c) (red squares). We have
found that only the weakest threshold criterion together with dropping the requirement of
mutuality leads to a spanning giant component. This agrees with the finding [36] that
applying strong criteria for constructing friendship networks leads to network instability,
while with weak criteria the network turns out to be stable.
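
The threshold analysis described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the input format (a dictionary of directed nominations), the function names, and the normalization of $n_s$ by the total number of nodes $N$ are assumptions made here.

```python
from collections import defaultdict

def threshold_components(nominations, w_c, mutual_only):
    """Sizes of the connected components kept at threshold w_c.

    nominations: dict (i, j) -> weight of the nomination i -> j (1..6).
    A missing direction counts as weight 0, as described in the text.
    """
    nodes = {n for pair in nominations for n in pair}
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), w_ij in nominations.items():
        w_ji = nominations.get((j, i), 0)
        if mutual_only and w_ji == 0:
            continue  # keep only reciprocated nominations
        w = (w_ij + w_ji) / 2  # mean weight over both directions
        if w > w_c:
            parent[find(i)] = find(j)

    sizes = defaultdict(int)
    for n in nodes:
        sizes[find(n)] += 1
    return sorted(sizes.values(), reverse=True)

def giant_fraction_and_second_moment(sizes):
    """Return (G/N, sum_s s^2 n_s) with the largest cluster excluded from the sum.

    Assumption: n_s is the number of clusters of size s divided by N.
    """
    n_total = sum(sizes)
    second_moment = sum(s * s for s in sizes[1:]) / n_total
    return sizes[0] / n_total, second_moment

# Hypothetical toy data: one mutual pair and two one-sided nominations.
toy = {(1, 2): 2, (2, 1): 4, (2, 3): 3, (4, 5): 1}
mutual = threshold_components(toy, w_c=0, mutual_only=True)
either = threshold_components(toy, w_c=1, mutual_only=False)
```

Sweeping `w_c` over the possible weights for both settings of `mutual_only` gives the two curves of each panel for a single school; averaging over schools is left out of the sketch.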

In our further analysis of community detection we assume that a dyadic link exists if any 
of the corresponding students nominates the other, and we do not consider any threshold for 
the weight. Imposing the minimum restriction possible for the creation of a link allows us to 
search for communities in the interconnected giant component and to uncover preferences 
in the social relationships between the students. 



Networks of communities (c-networks) 



The social network reflects the structure of the society. Therefore it carries information
about the building bricks, the communities. However, it is a highly non-trivial task to
extract this information from the network itself. Communities are vaguely defined as groups
of vertices that have a high density of edges within them, with a lower density of edges
between the groups [37]. The recently introduced method of community detection, the
"clique percolation method" [39, 40], seems particularly appropriate to handle this problem
because it enables overlapping communities, which are typical for social networks. Two
communities overlap if they share at least one member. In most of the friendship groups
there are members who simultaneously belong to more than one such group. This feature
is known as affiliation (see, e.g., [8]) in the social networks literature and is an aspect of large
networks which is very important, yet it has not been satisfactorily addressed
by the network clustering methods developed prior to the k-clique percolation approach.

FIG. 2: C-networks of 3-clique communities at School 1 (a) and School 2 (c), compared to the
corresponding c-networks of 4-clique communities ((b) and (d), respectively). The node size is
proportional to the square root of the number of nodes in the community. Although each community
can have students from different races, we assign to it the color of the race of the majority of
its members.

A k-clique is a fully connected subgraph containing k nodes. A k-clique community is
defined as a group of k-cliques that can be reached from each other through a series of
adjacent k-cliques sharing k - 1 nodes. After determining the k-clique communities, it turns
out that there are nodes which belong to more than one community. Using these shared
nodes one can construct the c-network of communities, where the communities themselves
constitute the c-nodes and the shared nodes of the original network form the c-links between
them. In the following we analyze the c-network of communities based on the friendship
networks of the schools.
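
The definition above translates directly into code. The following is a minimal sketch, assuming an undirected graph given as a dictionary of neighbour sets; it enumerates k-cliques by brute force, which is only practical for small illustrative graphs, and it is not the implementation used for the analysis in this paper. The function names are hypothetical.

```python
from itertools import combinations

def clique_communities(adj, k):
    """k-clique communities of an undirected graph (adj: node -> set of neighbours)."""
    nodes = sorted(adj)
    # Brute-force enumeration of all fully connected k-node subgraphs.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Two k-cliques are adjacent if they share k - 1 nodes.
    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)

    # A community is the union of all cliques reachable through adjacent cliques.
    communities = {}
    for idx, clique in enumerate(cliques):
        communities.setdefault(find(idx), set()).update(clique)
    return [frozenset(c) for c in communities.values()]

def c_network(communities):
    """c-links between communities: (index a, index b) -> number of shared nodes."""
    links = {}
    for a, b in combinations(range(len(communities)), 2):
        shared = communities[a] & communities[b]
        if shared:
            links[(a, b)] = len(shared)
    return links

# Toy graph: triangles {1,2,3} and {2,3,4} share two nodes and merge into one
# community; triangle {4,5,6} shares only node 4 with it and stays separate.
edges = [(1, 2), (2, 3), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (4, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
communities = clique_communities(adj, 3)
links = c_network(communities)
```

In the toy graph, node 4 belongs to both 3-clique communities and therefore forms the single c-link between the two c-nodes.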

Fig. 2 shows the c-networks of k-clique communities extracted from the friendship networks
of School 1 and School 2. Figs. 2(a) and 2(c) are based on 3-clique communities of the friendship
networks of School 1 (Fig. 1(a)) and School 2 (Fig. 1(b)), respectively. In turn, Figs. 2(b) and 2(d)
are based on 4-clique communities extracted from the same schools. The area of the circles
represents the number of nodes within the community and each node color is related to a
race.

A comparison of Figs. 2(a) and (c) with Figs. 2(b) and (d) shows that there is a dramatic difference
between the c-networks based on 3-clique or 4-clique communities. For the 3-clique communities we
see in both schools complex c-networks with rich, interconnected structures, which include 
the great majority of the students, while the c-networks of 4-clique communities are rather 
sparse (less than 20% of the students belong to them) and the structures are fragmented. 

It has been suggested [39] that the optimal value of k for uncovering the community
structure in a network is the largest one which still assures percolation, i.e., interconnectedness.
In contrast to other studied networks [?], like protein networks or collaboration networks,
where the optimal value for detecting communities was k = 4 or 5, we have found that
triads are the optimal elementary cliques for the high school friendship networks. Although
it is shown here only for Schools 1 and 2, our finding is generally valid for the whole data
set. This is a new manifestation of the well known fact that triads play an eminent role
in interpersonal relations [43], which is also reflected in the high value of the average
clustering coefficient [48] of social networks [44].



Although we obtain the richest community structure for k = 3, it is worth having a look
at the c-networks based on the more cohesive 4-cliques. For k = 3 the relatively less
densely connected friendship circles already show up in the analysis, while for k = 4 only the
more strongly interconnected groups (in which each member is part of at least one 4-clique) are
found by the method. One of the interesting aspects of such a study is that on the level of
more cohesive groups (k = 4) the number of communities becomes balanced even for cases
when the ratio of the sizes of the ethnic groups is far from unity (and, correspondingly, on
the level of less cohesive groups, e.g., for k = 3, the students who are in the majority have much
larger friendship circles). From here (see Figs. 2(b) and (d)) we conclude that when in minority,
the students tend to form stronger ties; thus, the number of more densely interconnected
communities becomes over-represented compared to what happens in the k = 3 case.

FIG. 3: Different network properties averaged over the complete dataset of schools, for the
community networks (circles) and for friendship networks (squares): (a) Cumulative degree
distribution, (b) Degree-dependent clustering coefficient, (c) Average degree of the nearest
neighbor, (d) Cumulative distribution of the membership number ($m$) and of (d) the overlap size
($s_{ov}$) for the community networks.

Statistical properties of the c-networks 

In the following we statistically characterize the structure of the friendship networks and 
of the extracted c-networks based on 3-clique communities, where averages will be taken 
over all 84 schools in the data set. 

The cumulative degree distribution $P(n)$ is defined as the fraction of nodes having degree
larger than $n$. In Fig. 3(a) we show $P(n)$ for the friendship networks and compare it with the
cumulative degree distribution of the c-networks of communities. The distribution for the
friendship networks rapidly decreases, indicating that these networks have a characteristic
degree. This corresponds to the natural cutoff in the number of friends, in accordance with
the results reported for another friendship network [33]. Interestingly, the degree distribution
of the c-networks is much broader, and can be well fitted by a scale free, power-law function
of the form $\sim n^{-\gamma}$ with $\gamma \approx 1.5$. It is known that such scale free networks
emerge from growth processes where an effective preferential attachment, i.e., a "rich get richer"
mechanism, is at play [13]. Scale free c-networks have already been seen before [41], but the
transition from the rapidly decaying degree distribution in the friendship network to the scale
freeness of the c-networks is a relevant characteristic of social community formation and should
be taken into account for the formulation of models of large social networks [29].
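
As a small illustration of the definition used here (fraction of nodes with degree strictly larger than $n$), the cumulative distribution can be computed from a list of node degrees as follows. The function name and the input format are assumptions of this sketch.

```python
from collections import Counter

def cumulative_degree_distribution(degrees):
    """Map each observed degree n to P(n), the fraction of nodes with degree > n."""
    n_nodes = len(degrees)
    counts = Counter(degrees)
    remaining = n_nodes
    result = {}
    for n in sorted(counts):
        remaining -= counts[n]  # nodes left are those with degree > n
        result[n] = remaining / n_nodes
    return result

# Toy degree sequence (not data from the paper).
p = cumulative_degree_distribution([1, 2, 2, 3])
```
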

The degree distribution provides information about the dyadic relations while the
clustering coefficient characterizes the triads. The local clustering coefficient $C_i$ of a vertex $i$
with degree $n_i$ is defined as the ratio of the number of triangles connected to it and the
number of all possible triangles, $n_i(n_i - 1)/2$. The mean degree-dependent clustering
coefficient $C(n)$ is the average of the local clustering over all vertices with degree $n$. This
quantity is analyzed for the two types of networks and presented in Fig. 3(b). For the friendship
networks $C(n)$ varies slightly with $n$ for most of the observed $n$-range, decaying rapidly only
for larger degrees. Again, $C(n)$ for the c-network is much broader than for the friendship network
and can be reasonably fitted by a power law $C(n) \sim n^{-\alpha}$, with $\alpha \approx 2.8$. This
kind of dependence of the clustering coefficient as an inverse power of the node degree can be a
signature of a hierarchical structure of the networks [16, 47].
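
The two quantities defined in this paragraph can be sketched as follows for an undirected graph stored as a dictionary of neighbour sets. Setting $C_i = 0$ for vertices with fewer than two neighbours is a convention assumed here, not something stated in the text.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(adj):
    """C_i = (triangles through i) / (n_i (n_i - 1) / 2) for every vertex i."""
    c = {}
    for i, nbrs in adj.items():
        n_i = len(nbrs)
        if n_i < 2:
            c[i] = 0.0  # assumed convention for degree 0 or 1
            continue
        triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        c[i] = triangles / (n_i * (n_i - 1) / 2)
    return c

def degree_dependent_clustering(adj):
    """C(n): average of C_i over all vertices with degree n."""
    c = local_clustering(adj)
    by_degree = defaultdict(list)
    for i, nbrs in adj.items():
        by_degree[len(nbrs)].append(c[i])
    return {n: sum(vals) / len(vals) for n, vals in by_degree.items()}

# Toy graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
c_of_n = degree_dependent_clustering(adj)
```
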



Social networks are known to be assortative, i.e., high degree nodes are linked with
enhanced probability. The statistical analysis of this effect relies on the degree $\bar{n}(n)$ of
nearest neighbors averaged over all nodes of degree $n$. For an assortative (disassortative) network
$\bar{n}(n)$ is a monotonically increasing (decreasing) function of $n$. As expected, the friendship
networks turn out to be assortative (see Fig. 3(c)), but in contrast to networks with scale free
degree distribution (e.g., collaboration networks), $\bar{n}(n)$ also has a cutoff due to the rapid
decay in the degree distribution. On the other hand, the c-networks are disassortative, i.e.,
$\bar{n}(n)$ can be approximated by a power law with a negative exponent,
$\bar{n}(n) \sim n^{-\beta}$, with $\beta \approx 1.1$.
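
A minimal sketch of the nearest-neighbor degree measurement is given below, again assuming an undirected graph as a dictionary of neighbour sets; the function name is hypothetical. The toy example is a star, the simplest disassortative graph: the hub has low-degree neighbours and the leaves have a high-degree neighbour.

```python
from collections import defaultdict

def average_neighbor_degree(adj):
    """Mean degree of the nearest neighbours, averaged over all nodes of degree n."""
    by_degree = defaultdict(list)
    for i, nbrs in adj.items():
        if nbrs:  # isolated nodes have no neighbours to average over
            mean_nbr_degree = sum(len(adj[j]) for j in nbrs) / len(nbrs)
            by_degree[len(nbrs)].append(mean_nbr_degree)
    return {n: sum(vals) / len(vals) for n, vals in sorted(by_degree.items())}

# Star graph: hub 0 connected to leaves 1, 2 and 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
nn = average_neighbor_degree(star)
```

Here the curve decreases with the degree, which is the disassortative signature discussed above.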

We also calculate the membership number $m$ of each student, which is the number of
communities that the student belongs to. Fig. 3(d) displays the cumulative distribution of the
membership number $P(m)$, which shows that on average each student belongs to a limited
number of communities (less than 5). In turn, any two communities can share $s_{ov}$ nodes,
which defines the overlap size between these communities. Fig. 3 also shows the average of the
overlap distribution for all the schools, which is well fitted by a power law with the
exponent 2.9. We can conclude that students belong to at most 4 different clique-communities
inside the school, and that there is no characteristic overlap size in the networks (except for
that given by their finite size). Absence of a characteristic membership number and overlap
size has been observed in other social and biological networks but not in their randomized
versions [39]. Additionally, the clustering coefficients $\langle C \rangle$ for the friendship
networks and for the community networks both have a similar average value near 0.3, which is larger
than that of an equivalent random graph with the same number of nodes and links.
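
Given a list of communities (each a set of student identifiers), the membership numbers and overlap sizes defined above can be obtained as in the following sketch. The input format and function names are assumptions, and the example communities are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def membership_numbers(communities):
    """m for every node: the number of communities the node belongs to."""
    m = Counter()
    for community in communities:
        m.update(community)  # each member is counted once per community
    return dict(m)

def overlap_sizes(communities):
    """s_ov for every pair of communities that share at least one node."""
    return [len(a & b) for a, b in combinations(communities, 2) if a & b]

# Hypothetical communities, not data from the paper.
comms = [{1, 2, 3, 4}, {4, 5, 6}, {4, 6, 7, 8}]
m = membership_numbers(comms)
s_ov = overlap_sizes(comms)
```
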

FIG. 4: Measuring preferences of inter-racial connections $r \to r'$. $P(r,r')$ is the relative
frequency of directed links from Whites (full green line), Blacks (dotted black line), Hispanics
(dashed red line) and Asians (dashed-dotted yellow line) to each of the races $r'$ = W (circles),
B (squares), H (diamonds), and A (triangles). Racial preferences manifest themselves as systematic
deviations of the ratio $P(r,r')/P_r(r,r')$ from 1, where $P_r$ is the corresponding relative
frequency in the randomized samples. (a) $P/P_r$ in decreasing order from 1 to 4, for the nominations
made from $r$ to $r'$. (b) The corresponding Z-scores. The combination of (a) and (b) reveals
relations $r \leftrightarrow r'$ that are significantly absent. The results are the average over
the 84 school networks.

Ethnic preferences 

Racial/ethnic preferences in friendship selection contain crucial information about the
level of segregation, which constitutes one of the major sources of social conflicts. Quantifying
such concepts as preferences or segregation and working out the appropriate measurement
protocols are highly non-trivial tasks in a strongly inhomogeneous society (see [12], [?]).

We use the following quantitative method to measure 'preferential' nominations as a 
function of the attributes of the students. A nomination can be considered preferential, if 
pairs of nodes with given attributes are significantly more recurrent within the empirical 
networks than those in their randomized versions. In the studied sample of friendship 
networks, we find the dominant appearance of quantitatively preferential nominations among 
students of the same race, as a manifestation of homophily present in each grade and common 
to each racial group from all schools. Here we present in detail the measure of preferences 
in the School networks as a function of the race known for the nodes, without separating 
the information by grade. The same method can, of course, be used to measure preferences 
in any attributes. 

In each directed network we identify the frequency of the 25 possible race dyads, formed
from the 5 races attributed to the nodes. To focus on those dyads that are significantly
recurrent, we compare the real network to suitably randomized networks.

The randomized networks have the same single node characteristics as the real
networks: each node in the randomized network keeps its race and the same number of
incoming and outgoing edges as the corresponding node has in the real network. For
randomizing the networks we employ a Markov-chain algorithm, based on starting with the real
network and repeatedly swapping randomly chosen pairs of connections (A -> B, C -> D
is replaced by A -> D, C -> B) until the network is randomized [31, 32]. Switching is
prohibited if either of the connections A -> D or C -> B already exists. Thus the degree
of each node is preserved.
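
The switching procedure described above can be sketched as follows. This is an illustrative version only: the number of attempted swaps, the fixed random seed and the additional rejection of swaps that would create self-loops are assumptions of the sketch and are not specified in the text.

```python
import random

def randomize_directed(edges, n_swaps, seed=0):
    """Degree-preserving randomization of a directed edge list by pair switching.

    edges: list of distinct (source, target) pairs, at least two of them.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Switching is prohibited if A -> D or C -> B already exists.
        # Rejecting self-loops (a == d or c == b) is an extra assumption.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list

# Toy directed network.
original = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3), (2, 4)]
shuffled = randomize_directed(original, n_swaps=100)
```

Because every accepted switch only exchanges the targets of two edges, each node keeps its number of outgoing and incoming links, and node attributes such as race are untouched.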

In Fig. 4 we present results for the 4 main races identified at the schools: white, black,
Hispanic and Asian. For each race $r$, we calculate the relative frequency $P(r,r')$ of directed
links $r \to r'$ to a node with race $r'$. The presented results are the average over the 84 schools.
The comparison to randomized networks compensates for the effects of differences in the population
size of each race. Racial preferences manifest themselves as systematic deviations
of the ratio $P(r,r')/\langle P_r(r,r') \rangle$ from 1. The common behavior for each racial group is
to nominate friends of the same race (intra-ethnic nominations) more often than students
of any other race (inter-ethnic nominations). In Fig. 4(a) we present $P/\langle P_r \rangle$ in
decreasing order from 1 to 4, for the nominations made by each race $r$ (denoted by different
line styles and colors) as a function of the race $r'$ of the nominated nodes (indicated by
different symbols). Not only does the preference for intra-ethnic nominations become clear from
this plot, but also that, symmetrically, some inter-ethnic nominations are found 4 times less often
than in the randomized versions, e.g., Asians <-> blacks and blacks <-> whites. In Fig. 4(b) we
characterize the significance of the deviations by the Z-scores, defined as:

$$Z(r,r') = \frac{P(r,r') - \langle P_r(r,r') \rangle}{\sigma_r(r,r')}, \qquad (1)$$

where $\sigma_r(r,r')$ is the standard deviation of $P_r(r,r')$ calculated from 100 realizations of
randomized networks. The combination of these two plots reveals relations $r \leftrightarrow r'$
that are significantly absent.
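
To make the measurement concrete, the sketch below computes $P(r,r')$, the ratio $P/\langle P_r \rangle$ and the Z-score of Eq. (1) from an edge list and a set of randomized samples. Two choices are assumptions made here rather than statements of the paper: $P(r,r')$ is normalized by the number of links leaving race $r$, and the population standard deviation is used for $\sigma_r$.

```python
from collections import Counter
from statistics import mean, pstdev

def relative_frequencies(edges, race):
    """P(r, r'): fraction of the links leaving race r that point to race r'."""
    counts = Counter((race[i], race[j]) for i, j in edges)
    out_total = Counter()
    for (r, _), c in counts.items():
        out_total[r] += c
    return {pair: c / out_total[pair[0]] for pair, c in counts.items()}

def preference_scores(p_real, p_random_samples):
    """Return {(r, r'): (P / <P_r>, Z)} given P for the real and randomized networks."""
    scores = {}
    for pair, p in p_real.items():
        values = [sample.get(pair, 0.0) for sample in p_random_samples]
        mu, sigma = mean(values), pstdev(values)
        ratio = p / mu if mu > 0 else float("inf")
        z = (p - mu) / sigma if sigma > 0 else float("nan")
        scores[pair] = (ratio, z)
    return scores

# Toy example with made-up labels and made-up randomized frequencies.
race = {1: "W", 2: "W", 3: "B"}
p_real = relative_frequencies([(1, 2), (1, 3), (3, 1)], race)
scores = preference_scores({("W", "W"): 0.8},
                           [{("W", "W"): 0.5}, {("W", "W"): 0.7}])
```

In practice the list passed as `p_random_samples` would hold one dictionary per randomized realization of the same school network.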

FIG. 5: The ratio of the relative frequencies ($P/P_r$) vs. the fraction of the minority, i.e. the
black population ($f_b$). (a) For white -> white and white -> black nominations. (b) For
black -> black and black -> white nominations. $P/\langle P_r \rangle$ can be fitted by a negative
power law of the form $f_b^{-\alpha}$, with $\alpha = 0.6$ for black <-> black nominations and
$\alpha = 0.5$ for black <-> white nominations. This shows that although heterogeneity decreases
the relative frequency of $b \leftrightarrow b$, it does not favor inter-ethnic relations
$b \leftrightarrow w$.

Next, we illustrate how the measured quantity $P(r,r')/\langle P_r(r,r') \rangle$ can be used to
obtain certain characteristics of the friendship selection preference as a function of the racial
composition of the schools. In the following, we focus on the relations of two ethnic groups:
blacks (b) and whites (w). In Fig. 5 we show the obtained value of $P/\langle P_r \rangle$ vs. the
fraction of the minority ($f_b$), i.e. the students of the black population in each school network.
Fig. 5(a) shows the values for the nominations from whites, intra-ethnic $w \to w$ and
inter-ethnic $w \to b$. Equivalently, Fig. 5(b) shows the corresponding nominations from blacks,
$b \to b$ and $b \to w$. These figures show a sample of 64 schools in which each of the two races
(white and black) makes up at least 0.2% of the students.

We have observed that intra-ethnic nominations occur equally or more frequently than in
the randomized networks ($P/\langle P_r \rangle \geq 1$), while inter-ethnic nominations are less
likely to occur ($P/\langle P_r \rangle < 1$), and these results do not depend on the total size of
the population, $N$ (not shown). When we plot the same quantities as a function of the fraction of
the minority it is possible to extract some relevant tendencies from the entire sample. Note
that $P/\langle P_r \rangle$ vs. $f_b$ for $b \to b$ is greater than 1 and tends to 1 only when
$f_b \approx 1$ (top of Fig. 5(b)); for just such values, $P/\langle P_r \rangle$ of $w \to w$ is
considerably greater than 1 (top of Fig. 5(a)). These figures show that both races present the
following behavior: when the population of a given race is in the majority (fraction $f \approx 1$),
their intra-ethnic nominations resemble those of the randomized networks,
$P/\langle P_r \rangle \approx 1$, but when they represent a minority ($f \ll 1$) such populations
tend to make predominantly intra-ethnic nominations of friends ($P/\langle P_r \rangle \gg 1$).

In contrast to the intra-ethnic relations, the inter-ethnic nominations are non-symmetric
with respect to the composition. This is clearly shown in Figs. 5(a) and (b). It is natural that
$b \to w$ and $w \to b$ follow the same pattern. However, in the case of symmetric behavior
the limits $f_b \to 1$ and $f_b \to 0$ should be similar. Instead, we see a monotonic dependence of
$P/\langle P_r \rangle$ on $f_b$, which can be well fitted by a negative power law of the form
$f_b^{-\alpha}$ with $\alpha \approx 0.5$. Figs. 5(a) and (b) indicate that when blacks are in a
small minority, the frequency of the inter-ethnic relations corresponds to an almost perfect
desegregation, while in the other extreme, when whites are in a small minority, extremely strong
segregation occurs. Our results suggest the following picture: both whites and blacks show
increasing homophily as they become minorities. However, blacks as a small minority in a white
majority get more integrated than the other way around. This result points toward the finding that
the increase of racial heterogeneity does not necessarily favor inter-ethnic nominations between
the growing minority and the race of the majority, but may have the opposite effect [?].
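
A power-law exponent such as $\alpha$ above can be estimated, for instance, by an ordinary least-squares fit in log-log coordinates. The text does not say which fitting procedure was used, so the following is only one possible sketch; the function name and the synthetic data are assumptions.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of y = c * x**(-alpha) on log-log axes; returns (alpha, c)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
             / sum((a - mx) ** 2 for a in lx))
    return -slope, math.exp(my - slope * mx)

# Synthetic check: points generated from y = 2 * f_b**(-0.5), not real data.
f_b = [0.1, 0.2, 0.4, 0.8]
ratio = [2 * f ** -0.5 for f in f_b]
alpha, c = fit_power_law(f_b, ratio)
```

With real data, `f_b` would hold the black fraction of each school and `ratio` the corresponding measured value of $P/\langle P_r \rangle$; all values must be positive for the logarithms to exist.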



Conclusions 



In this article we have applied network concepts and tools to investigate the social
structure of schools. We used the Add Health database [24], which contains, among others,
detailed data about friendship nominations, age, gender, etc. We have first analyzed
the weighted friendship network where the weight of a link between students i and j
corresponds to the number of shared activities of i with j as nominated by i. We have
found striking asymmetries in the nominations and concluded that the community
structures can be best uncovered if the underlying networks are chosen with the weakest criteria
(one nomination in either direction already results in a link).

We have presented the statistical properties of these networks. The community
structure was studied by means of k-clique percolation and the c-network of communities was
constructed using overlap generated links. The optimal clique size was found to be k = 3,
in agreement with the special role of triads in social interactions. While the friendship
networks show the expected assortativity and their degree distribution has a sharp cutoff, the
c-networks are disassortative and they have a scale-free degree distribution.

Finally, we presented a statistical analysis of ethnic preferences in friendship selection
based on a comparison of the relative frequencies of $r \to r'$ links with those of a randomized
reference system. We have analyzed the preference order of the four major ethnic groups.
Furthermore, we concluded that very small black minorities in a white majority have better
balanced inter-ethnic relations than a small white minority in a black majority. This could
be related to the non-trivial effect of increasing ethnic heterogeneity on desegregation.

This research has been supported by grants OTKA T049674 and K60456. MCG thanks 
DAAD for financial support. HJH thanks the Max Planck Prize. 